THE EFFECTS OF VERBAL APPROVAL AND DISAPPROVAL UPON
THE PERFORMANCE OF THIRD AND FOURTH GRADE
CHILDREN ON FOUR SUBTESTS OF THE
WECHSLER INTELLIGENCE
SCALE FOR CHILDREN

Since the advent of the individual intelligence test, the question
of the effect of examiner-examinee interaction upon test performance has
been raised. Although the test manuals provide stringent regulations
to be followed in order to control and standardize the conditions for
test administration, the variability of examiner-examinee interaction
and its effect upon test performance requires more careful
examination. The problem of the influence of different external factors
on intelligence test results has been studied, but there are limited
experimental data and conflicting results.

There is general agreement in the literature that the testing
situation should elicit optimal performance of the examinee on a test
of mental ability (Klugman, 1944; Terman, 1916; Terman and Merrill, 1937;
Terman and Merrill, 1960; Wechsler, 1949). Physical conditions and
administration procedures can be specified and controlled, or at least
noted when varying from normal. Although it is possible to describe the
nature of the social relationship between the examiner and the examinee,
it is difficult to control or assess the effect of this social
relationship upon test performance in examiner-examinee interaction.

Terman as early as 1916 made the point that praise of a child's
efforts in the testing situation contributed more than anything else to
satisfactory rapport. The child should be kept interested, confident,
and at his best level of effort. "Exclamations like 'Fine!,' 'Splendid!,'
etc., should be used lavishly" during the examination (Terman, 1916,
p. 125). In a later publication, Terman and Merrill (1937) advised the
examiner to enlist the subject's best efforts, otherwise the resulting
score would be less than optimal to some unknown degree. The subject's
best efforts are to be enlisted by the establishment and maintenance
of adequate rapport. They felt that "It is wise to praise frequently
and generously" (p. 57).

In their latest revision of the Stanford-Binet Intelligence
Scale, Terman and Merrill (1960) reiterate the importance of rapport
to elicit the subject's best efforts and maintain both high motivation
and optimal performance level throughout the testing session. The
examiner is advised to encourage the subject through frequent and
generous praise, but this approval should be given for effort rather
than success on a particular response. Spontaneous comments such as
"Good!" and "Fine!" should be used to elicit the subject's best efforts.
However, under no circumstances should the examiner show dissatisfaction
with a response.

Wechsler (1949) in his Manual for the Wechsler Intelligence
Scale for Children (WISC) is less clear and specific than Terman and
Merrill (1960) in his consideration of the effect of examiner-examinee
interaction upon the performance of the child in the testing situation.
Like Terman and Merrill (1960) he hopes that the examiner will secure
scores which represent the optimum ability of the child. In comparing
his general testing considerations with those in the Stanford-Binet
Manual (1960), it appears that he is encouraging an interpersonal
relationship that is more neutral than that of Terman and Merrill.
According to Wechsler friendliness and warmth should characterize the
examiner's approach; in a school situation the examiner should be known
and accepted by the child to be tested and by the other children in his
group. In the testing situation supportive expressions are appropriate
if the subject does not do well on a test. Yet, he cautions against
the use of approval if a subject is making an effort or experiencing a
modicum of success in the tasks:

    children vary in their reaction to commendation from an adult;
    in no case should the examiner indicate dissatisfaction with
    a response as given nor build up an expectancy of approval in
    the subject so that giving no comment would be interpreted by
    him as disapproval (Wechsler, 1949, p. 19).

Wechsler in his general testing procedures implies a somewhat
neutral attitude as the proper frame of reference for the examiner in
presenting the test materials. This neutral set may appear as a negative
stimulus to some individuals. It is our contention that the "no comment"
type of examiner behavior in the testing interview is an ambiguous
stimulus since children, as a result of prior social conditioning, come
to the testing situation with the expectation of approval or disapproval
from an adult. If an examiner does not use encouragement or express
approval in some manner with some frequency, the examiner's "neutral"
behavior is likely to be responded to in a variety of ways; thus the test
results are more likely to be less than an optimal measure of intelligence.

Numerous studies have been conducted to observe the effect of
material and social incentives upon human and animal behavior. Of
particular interest to us were those studies which examined the effect of
verbal incentives, i.e., praise and blame, upon the performance of school
age children. Hurlock (1924; 1925a; 1925b) was one of the early
psychologists to attempt to assess the effect of praise and reproof upon
intelligence test performance in a testing situation. She first studied
the effects of verbal incentives upon group intelligence test performance
of third, fifth, and eighth grade children. In administering an
intelligence test using the test-retest method with a one week interval,
she used praise, reproof, and control groups. Her conclusion was that
neither praise nor reproof was superior, but that both tended to result
in better performance than did practice alone, with the treatment having
no significant differences by age, sex, race, or intelligence. In a
follow-up study (1925a) she again found that the verbal incentives of
praise and blame tended to raise IQ scores more than practice.

Following Hurlock's early efforts, several researchers have tried
to demonstrate the effects of incentives upon test performance. Klugman
(1944), in using the 1937 edition of the Stanford-Binet Intelligence
Scale, conducted a cross cultural study including Negro and white
children. He found that, although the differences were not statistically
significant, money was slightly more effective than verbal praise and
that Negro children responded to the money incentive better than the
white children. However, there was no control group; thus there was no
means of determining to what extent incentives affected performance.
Tiber (1963), in using the Stanford-Binet Intelligence Scale, Form L-M
to study the effect of verbal incentives, found no evidence that they
make a difference in test performance. The only statistically
significant differences noted were those between the various class and
caste groups which included middle and lower class white and Negro
children. In a study by Willcutt and Kennedy (1963) with fourth grade
children of lower-middle and upper-lower class, praise was found to be
more effective than either reproof or no incentive in performance on a
discrimination task. Effectiveness of verbal incentives did not differ
by level of intelligence.

In a review of 33 praise and blame studies conducted over a
fifty year period, Kennedy and Willcutt (1964) concluded that praise
tends to facilitate learning and blame tends to have a debilitating
effect. In findings that appear somewhat contradictory to the
above studies, Marshall (1965) reviewed 32 incentive studies (only five
of which were included in the Kennedy and Willcutt review) to assess
the use of punishment incentives and reported that reproof is more
effective in terms of its effects on subsequent performance when compared
to praise and control, or neutral, conditions. Perhaps this conflict can
in part be understood on the basis of whether the reinforcement was
scheduled directly following each student response or after the
completion of the situation. As suggested by Cofer and Appley (1966), the
effects of praise and reproof may differ according to whether the
reinforcement is contingent upon specific responses to a task
or is provided for overall performance.

A review of the research within the verbal operant conditioning
paradigm, in which experimenter variables were used as independent
variables, provides evidence that the subject's verbal behavior can be
manipulated by the experimenter's verbal behavior. The use of social
approval through such generalized conditioned reinforcers as "Good" and
"Fine" has been demonstrated to bring the verbal behavior of the subject
under the control of the experimenter (Greenspoon, 1962; Kanfer, 1958;
Krasner, 1958; Taffel, 1955; Verplanck, 1955; Wickes, 1956).
In these verbal conditioning studies, the reinforcement ranged
from a minimal verbal cue such as "mmm-hmm" to psychoanalytically
derived interpretations. Verbal reinforcement by the experimenter was
shown to affect the type of verbal behavior and the frequency of that
verbal behavior, such as an increase in self-reference statements, types
of verbs used, sentence length, opinions stated, and use of personal
pronouns.

Considering that the research indicates that verbal incentives can
improve performance on visual-motor tasks and that one's verbal behavior
can be systematically modified by a verbal reinforcer, we formulated a
problem of trying to modify an individual's performance on subtests of the
WISC through the use of social reinforcers by the examiner. Littell
(1960), in a review of a decade of research on the WISC, stated that,

    there appears to be strong reason to suspect that WISC scores
    are affected systematically by many variables other than
    intelligence, but little information about the exact nature of
    these variables and the relationships involved is available
    (p. 153).

Littell further stated that specific research with the WISC is needed to
study (1) variables in the relationship between examiner and examinee,
and (2) the influence of circumstances upon the examination.

HYPOTHESES 

The purpose of this study was to determine the effects of three
modes of test administration upon the performance of third and fourth
grade boys and girls on four subtests of the WISC. Social reinforcers,
namely, verbal approval and disapproval, were systematically presented to
the examinees after specified items and between subtests to determine
whether there was any differential effect on test performance.

The following null hypotheses were tested:

1. There is no significant difference between the mean scores of
the group receiving Approval and the group receiving Disapproval
on selected subtests of the WISC.

2. There is no significant difference between the mean scores of
the group receiving Approval and the group receiving Neutral
treatment on selected subtests of the WISC.

3. There is no significant difference between the mean scores of
the group receiving Neutral treatment and the group receiving
Disapproval on selected subtests of the WISC.

PROCEDURE 

The subjects for the study were 90 third and fourth grade pupils
who were randomly selected from a total population of 111 pupils in the
third and fourth grades at Florida State University Elementary School.
They were randomly assigned to six groups identified in pairs as
Disapproval, Neutral, and Approval (D, N, A). Table 1 shows the mean IQ,
SD, and range for each group. A majority of the subjects were members
of upper middle class families residing in residential areas in and
around a capital city of 60,000 persons. Occupational designations of
most of the fathers were in the professional, technical, and managerial
categories.
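
The selection and assignment described above can be illustrated with a minimal sketch. It assumes that pupils are identified by hypothetical roster numbers 1 to 111 and that each of the six treatment-by-examiner groups receives 15 pupils, as the footnote to Table 1 indicates. The function name, the seed, and the use of Python's standard random module are illustrative choices and are not the authors' actual procedure.

```python
import random

def assign_groups(roster, n_selected=90, seed=0):
    """Randomly select pupils, then split them evenly over six groups."""
    rng = random.Random(seed)              # fixed seed only for repeatability
    selected = rng.sample(roster, n_selected)  # random selection, random order
    # Six cells: three treatments, each paired with two examiners.
    cells = [(t, e) for t in ("D", "N", "A") for e in (1, 2)]
    size = n_selected // len(cells)        # 15 pupils per cell
    return {cell: selected[i * size:(i + 1) * size]
            for i, cell in enumerate(cells)}

# Hypothetical roster numbers standing in for the 111 pupils.
groups = assign_groups(list(range(1, 112)))
for cell, pupils in groups.items():
    print(cell, len(pupils))
```

Because the sample is drawn in random order, slicing it into consecutive blocks is equivalent to assigning each selected pupil to a group at random.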

Four selected subtests from the WISC were administered to all
subjects: Arithmetic, Digit Span, Picture Arrangement, and Block Design.
It was hypothesized that these subtests would be most sensitive to
treatment effects and could be scored objectively.

Three treatments, namely, verbal disapproval (D), verbal approval
(A), and neutral (N), were used. Verbal approval was defined by the
statements "Good!," "Fine!," "That was good," "That was fine." Such
statements were made after the subject's response to the first item
(whether right or wrong) in each subtest and between subtests.
Disapproval was defined by the statements: "I thought you could do better
than that" (after the response to the first item, whether right or wrong,
in each subtest) and "That wasn't so good" (between each of the subtests).
In all cases the statement was made while looking at the child. For
the Neutral group, there was no conscious or scheduled attempt to provide
systematic verbal approval or disapproval. The statement, "Now let's
try these," was given to all subjects in the three groups immediately
before starting the next subtest. Standard test administration
procedures as prescribed in the WISC Manual were followed with all three
groups. The three treatments were alternated so that no single
treatment was limited to any given day or time of the day.



TABLE 1

EQUIVALENCE OF GROUPS FOR INTELLIGENCEᵃ

Groupᵇ                  Mean IQ       SD       IQ Range

Disapproval (N=30)
  Examiner 1             112.4       10.6       95-133
  Examiner 2             112.3       10.3       98-131
  Total                  112.4       10.2       95-133

Neutral (N=30)
  Examiner 1             112.9       13.1       93-132
  Examiner 2             112.7       10.1       97-134
  Total                  112.8       11.7       93-134

Approval (N=30)
  Examiner 1             112.5        9.6       97-133
  Examiner 2             112.6       10.4       89-126
  Total                  112.5       10.1       89-133

ᵃ The Primary Mental Abilities Test was administered by Examiner 2 six
weeks prior to the experiment.
ᵇ Each of the six groups had an N of 15.

Both examiners used the same physical facilities, but at different
times, and followed standard procedures before and after the actual
testing situation. Each examiner escorted the child from the classroom to
the testing room, which had a table and two chairs. While walking from
the subject's classroom to the testing room, the examiner attempted to
establish rapport with the child. A positive interpersonal relationship
established through chit-chat before beginning an experiment is
likely to make the experimenter's use of reinforcement cues more effective
(Sapolsky, 1960; Solley and Long, 1958). Before having the children in
the D group return to their room, the examiner said, "You fooled me, you
did better than I thought you would. In fact you did very well." No
disturbing carryover of anxiety generated by the experiment was reported
by the teachers.



RESULTS 

For the analysis of the data, the standard scaled scores for the
WISC were used. The four subtest scores were summed and mean standard
scores for each group (D, N, A) were computed (See Table 2). A one-way
analysis of variance was used to test the significance of the differences
among the means of the three treatments. The obtained F ratio (5.05)
was significant at the .01 level of confidence. (Critical F, with 2 and
87 degrees of freedom, was 4.85.) Tukey's HSD test was used to make all
pairwise comparisons among means (Kirk, 1960). The results are shown in
Table 3.
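
As a rough check on the reported F ratio, the one-way analysis can be retraced from the treatment and total sums of squares printed in Table 4. The sketch below assumes that the one-way within-groups sum of squares equals the total minus the treatment sum of squares; the variable names are illustrative, and the critical value is simply the one quoted in the text.

```python
# Sums of squares as printed in Table 4.
ss_treatment = 504.9
ss_total = 4850.4
n_subjects, n_groups = 90, 3

# Assumption: in a one-way layout, all variation not due to
# treatment is counted as within-groups variation.
ss_within = ss_total - ss_treatment
df_between = n_groups - 1            # 2
df_within = n_subjects - n_groups    # 87

ms_between = ss_treatment / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

critical_f_01 = 4.85  # value quoted in the text for 2 and 87 df
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
print("exceeds the quoted .01 critical value:", f_ratio > critical_f_01)
```

Under that assumption the computed ratio agrees with the reported value of 5.05.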

A two-way analysis of variance was made to determine whether there
were any significant differences between examiners according to treatment
administered (See Table 4).

TABLE 2

MEAN STANDARD SCORES AND STANDARD DEVIATIONS
ON FOUR SUBTESTS OF THE WISC ACCORDING
TO TREATMENT (D, N, A)

                          Mean                    Standard Score
Groupᵃ               Standard Score      SD           Range

Disapproval
  Examiner 1              42.9           7.9          30-61
  Examiner 2              40.7           6.3          27-50
  Total                   41.8           7.2          27-61

Neutral
  Examiner 1              45.3           9.0          29-60
  Examiner 2              43.1           6.0          34-56
  Total                   44.2           7.7          29-60

Approval
  Examiner 1              47.6           7.3          34-59
  Examiner 2              47.6           3.5          40-53
  Total                   47.6           5.7          34-59

Grand Total               44.5           7.3          27-61

ᵃ Each treatment group had an N of 30 with Examiner 1 and Examiner 2
each having an N of 15 in each treatment group.
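
Because each examiner tested 15 children under every treatment, each treatment total in Table 2 should equal the simple average of the two examiner means. The short sketch below retraces that arithmetic using only the means printed in the table; the dictionary layout and names are illustrative.

```python
# Examiner 1 and Examiner 2 means for each treatment, from Table 2.
examiner_means = {
    "Disapproval": (42.9, 40.7),  # first entry read as 42.9
    "Neutral": (45.3, 43.1),
    "Approval": (47.6, 47.6),
}

# With equal group sizes (15 and 15), the combined mean is the plain average.
treatment_means = {
    group: round(sum(pair) / len(pair), 1)
    for group, pair in examiner_means.items()
}

# The grand mean is the average over all six equal-sized groups.
all_means = [m for pair in examiner_means.values() for m in pair]
grand_mean = round(sum(all_means) / len(all_means), 1)

print(treatment_means)
print("grand mean:", grand_mean)
```

The averages reproduce the Total rows and the Grand Total mean of the table.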

TABLE 3

MEANS, DIFFERENCES AMONG MEANS, AND SIGNIFICANCE OF
DIFFERENCES AT .05 LEVEL OF CONFIDENCE
FOR THE APPROVAL, NEUTRAL, AND
DISAPPROVAL TREATMENTS

                    Differences Among Means
Means               M_A       M_N       M_D

M_A = 47.6           -        3.4       5.8*
M_N = 44.2                     -        2.4
M_D = 41.8                               -

*p < .05 where HSD = 4.35, using a two-tailed test.
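
The pairwise comparisons in Table 3 amount to comparing each absolute difference between treatment means with the HSD value. A minimal sketch follows, taking the HSD as 4.35 and the Neutral mean as 44.2 (the Neutral total in Table 2); the function and variable names are illustrative.

```python
from itertools import combinations

# Treatment means from Tables 2 and 3.
means = {"A": 47.6, "N": 44.2, "D": 41.8}
HSD = 4.35  # honestly significant difference at the .05 level, read from Table 3

def pairwise_comparisons(group_means, hsd):
    """Return {(g1, g2): (difference, exceeds_hsd)} for every pair of groups."""
    results = {}
    for g1, g2 in combinations(group_means, 2):
        diff = round(abs(group_means[g1] - group_means[g2]), 1)
        results[(g1, g2)] = (diff, diff > hsd)
    return results

comparisons = pairwise_comparisons(means, HSD)
for pair, (diff, significant) in comparisons.items():
    print(pair, diff, "significant" if significant else "not significant")
```

Only the Approval and Disapproval pair exceeds the HSD, which matches the single starred entry in the table.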





TABLE 4

ANALYSIS OF VARIANCE FOR TREATMENT (D, N, A)
BY EXAMINER (1, 2)ᵃ

                 Sum of              Mean               Required F
Source           Squares     df     Squares      F      for p < .05

Examiner            48.4      1       48.5       <1         4.0
Treatment          504.9      2      252.4      5.0         3.1
Interaction         24.2      2       12.1       <1         3.1
Within            4272.9     84       50.9
Total             4850.4     89

ᵃ Both independent variables (examiner and treatment) were assumed to
be fixed.
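
The F column of Table 4 can be retraced from the sums of squares and degrees of freedom. The sketch below assumes the fixed-effects model named in the table footnote, so every effect is tested against the within-cells mean square; the names are illustrative.

```python
# (sum of squares, degrees of freedom) for each effect, from Table 4.
effects = {
    "Examiner": (48.4, 1),
    "Treatment": (504.9, 2),
    "Interaction": (24.2, 2),
}
ss_within, df_within = 4272.9, 84

# Fixed-effects model: each effect is tested against the within mean square.
ms_within = ss_within / df_within
f_ratios = {
    source: (ss / df) / ms_within
    for source, (ss, df) in effects.items()
}

print(f"within mean square = {ms_within:.1f}")
for source, f in f_ratios.items():
    print(f"{source}: F = {f:.2f}")
```

The treatment ratio rounds to the tabled 5.0, and the examiner and interaction ratios both fall below 1.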

DISCUSSION

Examination of the mean standard scores of each group (D, N, A)
indicates a differential effect between the type of treatment or verbal
reinforcement and performance on the four subtests of the WISC (See Table
2). The verbal disapproval caused a decrement in performance and the
verbal approval an increase when compared to a group which was given no
such verbal reinforcement. Null hypothesis 1 was rejected since the
group receiving verbal approval (A) scored significantly higher than the
group (D) which received verbal disapproval. Although in actual practice
it is not likely that this amount of negative verbal behavior would be
present in the examiner, the finding does demonstrate the effect of
such verbal behavior upon the performance of third and fourth grade
children on standardized test items.

Null hypothesis 2 was not rejected since the group receiving
verbal approval (A) did not score significantly higher than the group
(N) which received no scheduled verbal reinforcement. The directional
trend was in favor of the group that received verbal approval. If one
extrapolated the average difference of .85 IQ points per subtest (the
3.4-point difference between the two group means divided by the four
subtests), the difference in comparing the two treatments for the
complete WISC would be eight IQ points. Although not statistically
significant in this experiment, this finding appears to have significance
for further research since the treatment given these two groups may
approximate the actual behavior of examiners who administer the WISC.
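
The extrapolation in the preceding paragraph can be retraced with simple arithmetic. The sketch assumes that ten subtests enter the complete WISC score and that the per-subtest difference would hold on every subtest; neither assumption is tested in this study, and the conversion from summed scaled scores to IQ points is left out because the text does not state it.

```python
# Treatment means on the sum of four subtests (Table 2).
mean_approval = 47.6
mean_neutral = 44.2

subtests_given = 4
subtests_full_scale = 10  # assumption: subtests entering the complete WISC score

diff_total = mean_approval - mean_neutral          # difference over four subtests
diff_per_subtest = diff_total / subtests_given     # average difference per subtest
extrapolated = diff_per_subtest * subtests_full_scale

print(f"per subtest: {diff_per_subtest:.2f}")
print(f"extrapolated over {subtests_full_scale} subtests: {extrapolated:.1f}")
```

On these assumptions the per-subtest difference is .85 and the extrapolated difference is about 8.5 points, in line with the figure of eight given in the text.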

Null hypothesis 3 was not rejected since there was no significant
difference between the performance of the subjects receiving no scheduled
verbal approval or disapproval (N) and those receiving verbal disapproval
(D). While the difference was not accepted as significant, the results
were in the direction of the N group performing higher. The "no
comment" type of behavior, in which no statements relevant to the
performance were made positively or negatively to the subjects immediately
before or during the testing period, could have been perceived by the
subjects as ambiguity. However, the general effect presumably was more
of a negative outcome in test performance than a positive one.

Although the results do not show any significant differences
between examiners (See Tables 2 and 4), the trend is unidirectional for
the D and N groups, in which Examiner 1's subjects scored higher, with
no difference in the A groups. Examiner 1 is 11 years older and larger
in build. Apart from chance, examiner differences (age, height,
personality, etc.) or the fact that many of the subjects knew Examiner 2
as a group test administrator in the school prior to the experiment may
account for this trend. Kanfer and Karas (1959) report that the
attitudes and perceptions of the subject toward the experimenter that
are based upon a pre-experimental relationship can be expected to
manifest themselves in the subject's responsiveness to the
experimenter's verbal reinforcement on a subsequent conditioning task.

Subject responses to the verbal disapproval were varied. Some
subjects reacted by responding more quickly to the items presented by
the examiner, while with others the time interval increased. Others
imitated the speech of the examiner in word accent and volume. Some
subjects showed evidence of physical changes in behavior such as increase
in breathing rate, sighing, unproductive movements, shakiness (especially
of the hands), whispered responses, a rise in the pitch of the voice,
and putting the thumb in the mouth.

The results of this experiment suggest that the verbal behavior
of the examiner within a testing situation can significantly influence
the test performance of third and fourth grade boys and girls, at least
of those who are reared in upper middle class families. Giving verbal
disapproval and providing verbal approval have significantly different
effects, with the latter resulting in higher test performance. While
the social reinforcement given in real testing situations may not be as
frequent or intense as in this study, the amount of approval (support,
praise, or encouragement) given by different examiners is likely to vary
considerably, and thus needs to be recognized as an examiner-examinee
variable that can influence test results.

This study suggests areas for further research in examiner-examinee
interaction and the differential effects of examiner and examinee
variables upon test performance. Replication studies should be conducted
to determine whether these results hold true for children representative
of our population as well as children of different age levels. Children
from a different social-cultural milieu, for example, the culturally
disadvantaged, should be included. Those children who have been trained
by their parents through negative reinforcement might conceivably respond
differently to the treatment in the testing situation than those who have
been trained primarily through positive reinforcement. Additional
treatment groups could be included to provide treatment levels of
increased verbal approval and of increased verbal disapproval to
determine whether there is a linear relationship. Further research might
include other subtests of the WISC, examiners of both sexes, several
examiners, and video tape recorded test sessions for analysis of verbal
and nonverbal behavior. Other examinee variables that might be studied
for differences according to treatment are level of anxiety, ... level,
and sex.